Our domain of interest is the impact of COVID-19 in the U.S. This spring, an outbreak of the virus has threatened hundreds of thousands of lives in the U.S., and even more around the world. Our group hopes that our analysis of multiple COVID-19 datasets will encourage more vigilance and help people understand the disease better.
The first dataset is a comprehensive one covering each U.S. county, with information on weather, social and health factors, and the local COVID-19 situation. Because its size exceeds the upload limit, we decided to keep it local.
We also found a dataset of provisional COVID-19 death counts broken down by state, sex, and age, which helps us understand the bigger picture.
Finally, the US State COVID-19 Daily dataset records the number of daily cases in the U.S., and a helper dataset stores the latitude and longitude of each state and of the country to support our charts.
Our summary table summarizes the total number of COVID-19 tests conducted in the U.S. It shows the number of positive and negative results, along with positive results as a percentage of all tests. The data is grouped by state so that we can compare the situations in different states, and the table is sorted in descending order by the percentage of positive results. From the table, we can conclude that New Jersey has the highest percentage of positive COVID-19 results among all the states. Notably, even though New York has conducted almost three times as many tests as New Jersey, its percentage of positive results is lower. Another insight is that some states have conducted relatively few tests: Wyoming, Vermont, Montana, and Maine each report fewer than 25,000.
| State | Total Positive Cases | Total Negative Cases | Total Tests | Percent Positive (%) |
|---|---|---|---|---|
| NJ | 140742 | 292317 | 433059 | 32.50 |
| NY | 338479 | 886580 | 1225059 | 27.63 |
| CT | 34333 | 104091 | 138424 | 24.80 |
| DC | 6485 | 24559 | 31044 | 20.89 |
| DE | 6741 | 26540 | 33281 | 20.25 |
| MD | 34061 | 135425 | 169486 | 20.10 |
| PR | 2294 | 9304 | 11598 | 19.78 |
| MA | 79324 | 322164 | 401488 | 19.76 |
| PA | 57989 | 237989 | 295978 | 19.59 |
| CO | 19879 | 88759 | 108638 | 18.30 |
| NE | 8572 | 39354 | 47926 | 17.89 |
| IL | 83017 | 388546 | 471563 | 17.60 |
| IN | 25126 | 125383 | 150509 | 16.69 |
| VA | 25800 | 129511 | 155311 | 16.61 |
| IA | 12912 | 68361 | 81273 | 15.89 |
| MI | 48012 | 259869 | 307881 | 15.59 |
| SD | 3663 | 21529 | 25192 | 14.54 |
| LA | 32050 | 195962 | 228012 | 14.06 |
| GA | 34633 | 227544 | 262177 | 13.21 |
| KS | 7116 | 46989 | 54105 | 13.15 |
| RI | 11613 | 83625 | 95238 | 12.19 |
| OH | 25250 | 192474 | 217724 | 11.60 |
| MN | 12494 | 108304 | 120798 | 10.34 |
| MS | 9908 | 87784 | 97692 | 10.14 |
| NV | 6310 | 57750 | 64060 | 9.85 |
| AZ | 11734 | 111079 | 122813 | 9.55 |
| NH | 3158 | 32391 | 35549 | 8.88 |
| WI | 10610 | 112729 | 123339 | 8.60 |
| SC | 7927 | 85208 | 93135 | 8.51 |
| MO | 10006 | 111290 | 121296 | 8.25 |
| AL | 10310 | 122908 | 133218 | 7.74 |
| NC | 15345 | 186898 | 202243 | 7.59 |
| TX | 39868 | 485828 | 525696 | 7.58 |
| FL | 41921 | 537657 | 579578 | 7.23 |
| ID | 2260 | 30418 | 32678 | 6.92 |
| WA | 17121 | 234986 | 252107 | 6.79 |
| CA | 69329 | 963526 | 1032855 | 6.71 |
| KY | 6677 | 97340 | 104017 | 6.42 |
| ME | 1477 | 22091 | 23568 | 6.27 |
| AR | 4164 | 66274 | 70438 | 5.91 |
| VI | 68 | 1115 | 1183 | 5.75 |
| TN | 16110 | 267713 | 283823 | 5.68 |
| OK | 4731 | 91379 | 96110 | 4.92 |
| NM | 5069 | 101636 | 106705 | 4.75 |
| WY | 675 | 14384 | 15059 | 4.48 |
| VT | 927 | 20327 | 21254 | 4.36 |
| OR | 3283 | 74291 | 77574 | 4.23 |
| UT | 6431 | 147053 | 153484 | 4.19 |
| GU | 149 | 3916 | 4065 | 3.67 |
| ND | 1571 | 46261 | 47832 | 3.28 |
| WV | 1371 | 63697 | 65068 | 2.11 |
| MT | 461 | 22563 | 23024 | 2.00 |
| HI | 633 | 37305 | 37938 | 1.67 |
| AK | 383 | 29570 | 29953 | 1.28 |
| MP | 19 | 2854 | 2873 | 0.66 |
| AS | 0 | 105 | 105 | 0.00 |
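The percentage column above can be reproduced from the positive and negative counts alone. The following is a minimal sketch in plain Python, not necessarily the code we used to build the table: it assumes that the total is simply positive plus negative results, and the function name `summarize` and the dictionary keys are hypothetical. The four input rows are copied from the table.

```python
# Rows copied from the summary table above: (state, positive, negative).
counts = [
    ("NY", 338479, 886580),
    ("AS", 0, 105),
    ("NJ", 140742, 292317),
    ("CT", 34333, 104091),
]


def summarize(rows):
    """Return per-state totals and percent positive, highest percentage first.

    Assumption: total tests = positive + negative results.
    """
    summary = []
    for state, positive, negative in rows:
        total = positive + negative
        # Guard against division by zero for a state with no reported tests.
        percent = round(100 * positive / total, 2) if total else 0.0
        summary.append(
            {
                "state": state,
                "positive": positive,
                "negative": negative,
                "total": total,
                "percent_positive": percent,
            }
        )
    # Sort in descending order of percent positive, as in the table.
    return sorted(summary, key=lambda row: row["percent_positive"], reverse=True)


summary = summarize(counts)
for row in summary:
    print(f'{row["state"]}: {row["total"]} tests, {row["percent_positive"]:.2f}% positive')
```

With these four rows, the sketch orders the states NJ, NY, CT, AS, which matches their relative order in the table.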
Our first chart is a bar chart that shows the total number of people hospitalized in each state up to the latest day for which data was collected. We include this chart because we would like to compare the states and understand the general situation in the U.S. right now. From this bar chart, we can see that New York State has far more hospitalizations than any other state. While every other state has fewer than 10,000, New York has over 73,000.
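Taking the value "up to the latest day" means keeping, for each state, the record with the most recent date. Below is a minimal sketch of that step in plain Python. The field names (`date`, `state`, `hospitalized`), the function name `latest_per_state`, the placeholder state codes, and all values are made up for illustration and are not taken from our dataset.

```python
# Illustrative records only: field names, state codes, and values are made up.
records = [
    {"date": "2020-05-10", "state": "AA", "hospitalized": 120},
    {"date": "2020-05-11", "state": "AA", "hospitalized": 135},
    {"date": "2020-05-11", "state": "BB", "hospitalized": 40},
    {"date": "2020-05-09", "state": "BB", "hospitalized": 31},
]


def latest_per_state(rows):
    """Map each state to the hospitalized count on its most recent date."""
    latest = {}
    for row in rows:
        current = latest.get(row["state"])
        # ISO-formatted dates (YYYY-MM-DD) compare correctly as plain strings.
        if current is None or row["date"] > current["date"]:
            latest[row["state"]] = row
    return {state: row["hospitalized"] for state, row in latest.items()}


latest = latest_per_state(records)
# Order the bars from the largest to the smallest count.
bars = sorted(latest.items(), key=lambda item: item[1], reverse=True)
print(bars)
```

The resulting list of (state, count) pairs can be passed to any plotting library to draw the bars.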
Our second chart is a map that places a circle on each state and uses the size of the circle to show the number of positive COVID-19 cases. We included this chart because a map lets us show geographic location and the number of positive cases at the same time. From this map, we can tell that the Northeast suffers the most, and that California has many more cases than the other western states. The map also matches the common expectation that more populous states have more cases, which is consistent with the generally lower case counts in the Midwest.
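A map like this needs two steps: joining each state's case count with its coordinates from the helper dataset, and turning the count into a circle size. The sketch below shows one common way to do this, under stated assumptions rather than as a description of our actual chart: the radius grows with the square root of the count so that the circle's area is proportional to the number of cases. The case counts are copied from the summary table, the coordinates are rough approximations for illustration only, and the names `circle_markers` and `max_radius` are hypothetical.

```python
import math

# Positive case counts copied from the summary table.
positive_cases = {"NY": 338479, "NJ": 140742, "CA": 69329}

# Rough (latitude, longitude) values for illustration only,
# not the values stored in our helper dataset.
state_coords = {"NY": (43.0, -75.0), "NJ": (40.1, -74.5), "CA": (37.2, -119.4)}


def circle_markers(cases, coords, max_radius=30.0):
    """Join counts with coordinates and scale each circle's area to its count."""
    largest = max(cases.values(), default=0)
    markers = []
    for state, count in cases.items():
        if state not in coords:
            continue  # skip states that have no coordinates in the helper data
        lat, lon = coords[state]
        # Radius proportional to sqrt(count) makes area proportional to count.
        radius = max_radius * math.sqrt(count / largest) if largest else 0.0
        markers.append({"state": state, "lat": lat, "lon": lon, "radius": radius})
    return markers


markers = circle_markers(positive_cases, state_coords)
for marker in markers:
    print(f'{marker["state"]}: radius {marker["radius"]:.1f}')
```

Scaling the radius directly by the count would exaggerate large states, because the eye reads area rather than radius; the square-root scaling avoids that.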
Our third chart is a scatter plot that shows the daily positive COVID-19 cases from this January to mid-May. We include this plot because we would like to learn the basic trend in the growth of positive cases in the U.S. We found that until the middle of March, growth was controlled and steady. Sadly, starting in late March, the number of positive cases climbed from fewer than 500 to almost 190,000, and it has not decreased at all. Our interpretation is that this result may be due to more tests being conducted after late March.